Loan Data Exploration by Allan Visochek
========================================================

## Investigate the following variables

Borrower Attributes:

>CreditGrade
>CreditScoreRangeLower
>EmploymentStatus
>IsBorrowerHomeOwner
>LoanMonthsSinceOrigination
>StatedMonthlyIncome
>IncomeRange
>BorrowerState
>Occupation
>DebtToIncomeRatio

Loan Attributes:

>Term
>BorrowerRate
>LoanOriginalAmount
```r
# Load the Prosper loan data (paths follow the author's project layout)
setwd('Documents/data_science/p3/final_project/')
loanData <- read.csv('../data/prosperLoanData.csv')

# Term takes only a few distinct values, so treat it as a factor
loanData$Term <- factor(loanData$Term)

# Flag loans that have a non-empty credit grade
loanData$HasCreditGrade <- !(loanData$CreditGrade == '')

# Bucket the debt-to-income ratio
loanData$DebtLevel <- cut(loanData$DebtToIncomeRatio, c(0, .3, .49, 1, 10.5))
loanData$DebtLevelBucket <- cut(loanData$DebtToIncomeRatio, c(0, 0.49, 1, 10))

# Bucket loans by months since origination
loanData$LoanPeriod <- cut(loanData$LoanMonthsSinceOrigination, c(0, 13, 56, 65, 105))
loanData$LoanPeriod2 <- cut(loanData$LoanMonthsSinceOrigination, c(0, 55, 105),
                            labels = c("post-recession", "pre-recession"))

# Flag low credit scores and bucket the loan amount in $1,000 steps
loanData$BadCredit <- loanData$CreditScoreRangeLower < 600
loanData$LoanOriginalAmountBucket <- cut(loanData$LoanOriginalAmount, seq(0, 25000, 1000))
```
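One detail of `cut()` is worth keeping in mind for the bucket variables above: intervals are right-closed by default, so a value equal to the lowest break, or one outside the breaks entirely, becomes `NA`. The sketch below uses a small made-up vector (`dti` is hypothetical, not the Prosper data). Passing `include.lowest = TRUE` with a wider top break is shown as one possible way to keep such values, not as what the analysis actually did.

```r
# Toy debt-to-income values (hypothetical, not taken from the Prosper data)
dti <- c(0, 0.25, 0.49, 0.75, 10.01)

# Same breaks as DebtLevelBucket: intervals are (0,0.49], (0.49,1], (1,10]
default_buckets <- cut(dti, c(0, 0.49, 1, 10))
print(default_buckets)
# 0 and 10.01 fall outside every interval, so both come back as NA

# Assumption: closing the lowest interval and widening the top break
# keeps every value in a bucket
closed_buckets <- cut(dti, c(0, 0.49, 1, 10.5), include.lowest = TRUE)
print(closed_buckets)
```

The same applies to `LoanOriginalAmountBucket`: loans above the top break of 25,000 would land in `NA` rather than in a bucket.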
## Warning: position_stack requires constant width: output may be incorrect
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.1340 0.1840 0.1928 0.2500 0.4975
## 10%
## 0.09886
## 90%
## 0.3099
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 4000 6500 8337 12000 35000
##                    Employed     Full-time Not available  Not employed
##          2255         67322         26355          5347           835
##         Other     Part-time       Retired Self-employed
##          3806          1088           795          6134
## False True
## 56459 57478
##          AK    AL    AR    AZ    CA    CO    CT    DC    DE    FL    GA
##  5515   200  1679   855  1901 14717  2210  1627   382   300  6720  5008
##    HI    IA    ID    IL    IN    KS    KY    LA    MA    MD    ME    MI
##   409   186   599  5921  2078  1062   983   954  2242  2821   101  3593
##    MN    MO    MS    MT    NC    ND    NE    NH    NJ    NM    NV    NY
##  2318  2615   787   330  3084    52   674   551  3097   472  1090  6729
##    OH    OK    OR    PA    RI    SC    SD    TN    TX    UT    VA    VT
##  4197   971  1817  2972   435  1122   189  1737  6842   877  3278   207
##    WA    WI    WV    WY
##  3048  1842   391   150
##                                                        Accountant/CPA
##                               3588                               3233
## Administrative Assistant Analyst
## 3688 3602
## Architect Attorney
## 213 1046
## Biologist Bus Driver
## 125 316
## Car Dealer Chemist
## 180 145
## Civil Service Clergy
## 1457 196
## Clerical Computer Programmer
## 3164 4478
## Construction Dentist
## 1790 68
## Doctor Engineer - Chemical
## 494 225
## Engineer - Electrical Engineer - Mechanical
## 1125 1406
## Executive Fireman
## 4311 422
## Flight Attendant Food Service
## 123 1123
## Food Service Management Homemaker
## 1239 120
## Investor Judge
## 214 22
## Laborer Landscaping
## 1595 236
## Medical Technician Military Enlisted
## 1117 1272
## Military Officer Nurse (LPN)
## 346 492
## Nurse (RN) Nurse's Aide
## 2489 491
## Other Pharmacist
## 28617 257
## Pilot - Private/Commercial Police Officer/Correction Officer
## 199 1578
## Postal Service Principal
## 627 312
## Professional Professor
## 13628 557
## Psychologist Realtor
## 145 543
## Religious Retail Management
## 124 2602
## Sales - Commission Sales - Retail
## 3446 2797
## Scientist Skilled Labor
## 372 2746
## Social Worker Student - College Freshman
## 741 41
## Student - College Graduate Student Student - College Junior
## 245 112
## Student - College Senior Student - College Sophomore
## 188 69
## Student - Community College Student - Technical School
## 28 16
## Teacher Teacher's Aide
## 3759 276
## Tradesman - Carpenter Tradesman - Electrician
## 120 477
## Tradesman - Mechanic Tradesman - Plumber
## 951 102
## Truck Driver Waiter/Waitress
## 1675 436
##           A    AA     B     C     D     E    HR
## 29084 14551  5372 15581 18345 14274  9795  6935
##           A    AA     B     C     D     E    HR    NC
## 84984  3315  3509  4389  5649  5153  3289  3508   141
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.0 660.0 680.0 685.6 720.0 880.0 591
## $0 $100,000+ $1-24,999 $25,000-49,999 $50,000-74,999
## 621 17337 7274 32192 31050
## $75,000-99,999 Not displayed Not employed
## 16916 7741 806
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 3200 4667 5608 6825 1750000
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.000 0.140 0.220 0.276 0.320 10.010 8554
## Warning in `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels)
## else paste0(labels, : duplicated levels in factors are deprecated
There are 113,937 loans in the dataset with 81 features, 13 of which were used in the analysis.

Numerical variables: BorrowerRate, CreditScoreRangeLower, DebtToIncomeRatio, LoanOriginalAmount, StatedMonthlyIncome

Term: 60, 36, 12
>(note that Term was a numerical variable but was transformed to a factor variable because it has very few values)

CreditGrade: AA, A, B, C, D, E, HR, NC, none

ProsperRating..Alpha.: AA, A, B, C, D, E, HR, none

IncomeRange: $100,000+ ; $75,000-99,999 ; $50,000-74,999 ; $25,000-49,999 ; $1-24,999 ; $0 ; Not employed ; Not displayed

EmploymentStatus: Employed, Full-time, Not employed, Part-time, Retired, Self-employed, none, Not available, Other

Occupation:
Accountant/CPA, Administrative Assistant, Analyst, Architect, Attorney, Biologist, Bus Driver, Car Dealer, Chemist, Civil Service, Clergy, Computer Programmer, Construction, Dentist, Doctor, Engineer - Chemical, Engineer - Electrical, Engineer - Mechanical, Executive, Fireman, Flight Attendant, Food Service, Food Service Management, Homemaker, Investor, Judge, Laborer, Landscaping, Medical Technician, Military Enlisted, Military Officer, Nurse (LPN), Nurse (RN), Nurse’s Aide, Other, Pharmacist, Pilot - Private/Commercial, Police Officer/Correction Officer, Postal Service, Principal, Professional, Professor, Psychologist, Realtor, Religious, Retail Management, Sales - Commission, Sales - Retail, Scientist, Skilled Labor, Social Worker, Student - College Freshman, Student - College Graduate Student, Student - College Junior, Student - College Senior, Student - College Sophomore, Student - Community College, Student - Technical School, Teacher, Teacher’s Aide, Tradesman - Carpenter, Tradesman - Electrician, Tradesman - Mechanic, Tradesman - Plumber, Truck Driver, Waiter/Waitress
BorrowerState: AK, AL, AR, AZ, CA, CO, CT, DC, DE, FL, GA, HI, IA, ID, IL, IN, KS, KY, LA, MA, MD, ME, MI, MN, MO, MS, MT, NC, ND, NE, NH, NJ, NM, NV, NY, OH, OK, OR, PA, RI, SC, SD, TN, TX, UT, VA, VT, WA, WI, WV, WY
Most loans go to borrowers listed as Employed or Full-time. Few loans are given out to individuals with low income or who are unemployed. Most borrowers do not have a credit grade (84,984 of the 113,937 records have a blank CreditGrade). Loan amounts range from $1,000 to $35,000, and 75% of loans are for $12,000 or less. Nearly all of the loans are 100% funded. Most loans have zero net principal loss. Most borrowers have had no delinquencies in the past 7 years. Few loans were originated from Q4 2008 through Q2 2009.
The main features of interest are LoanOriginationQuarter and BorrowerRate.
It is hard to say at this point. All variables mentioned above were selected for the investigation because they are likely to have an impact on the borrower rate.
`loanData$BorrowerRateCategory <- cut(loanData$BorrowerRate, c(0, 0.1, 0.31, 0.5), labels = c('low', 'normal', 'high'))`

`loanData$HasCreditGrade <- !(loanData$CreditGrade == '' | is.na(loanData$CreditGrade))`

`loanData$HasProsperRating <- !(loanData$ProsperRating..Alpha. == '' | is.na(loanData$ProsperRating..Alpha.))`

`HasIncome <- loanData$IncomeRange == "$75,000-99,999" | loanData$IncomeRange == "$50,000-74,999" | loanData$IncomeRange == "$25,000-49,999" | loanData$IncomeRange == "$1-24,999"`

`loanData$DebtLevel <- cut(loanData$DebtToIncomeRatio, c(0, .3, .49, 1, 10.5))`
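The breaks 0.1 and 0.31 used for the rate categories sit close to the 10% and 90% quantiles of BorrowerRate reported earlier (0.09886 and 0.3099), so "low" and "high" roughly flag the bottom and top tenth of rates. The sketch below shows how the labels are assigned, using a made-up vector (`rates` and `income` are hypothetical, not the Prosper data). It also shows `%in%` as a compact alternative to a chain of `==` comparisons.

```r
# Hypothetical borrower rates, chosen only to land in each category
rates <- c(0.05, 0.10, 0.18, 0.31, 0.35)

# Same breaks and labels as BorrowerRateCategory
rate_category <- cut(rates, c(0, 0.1, 0.31, 0.5),
                     labels = c("low", "normal", "high"))

# Right-closed intervals: 0.10 is still "low" and 0.31 is still "normal"
print(data.frame(rates, rate_category))

# %in% tests membership in a set of levels in one expression
income <- c("$0", "$1-24,999", "$50,000-74,999", "Not displayed")
has_income <- income %in% c("$1-24,999", "$25,000-49,999",
                            "$50,000-74,999", "$75,000-99,999")
print(has_income)
```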
debt to income…. stated monthly income…
I reordered the levels of the following factor variables:
ProsperRating..Alpha., CreditGrade, IncomeRange, LoanOriginationQuarter
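Reordering matters because R sorts factor levels alphabetically by default, which puts grades and income ranges in a misleading order on plot axes. The document does not show the reordering code, so the following is only a minimal sketch of one common approach, re-creating the factor with explicit `levels`. The `grades` vector is made up, and the level order follows the AA, A, B, C, ... listing given earlier.

```r
# Hypothetical credit grades in arbitrary order
grades <- factor(c("C", "AA", "HR", "A", "B"))
print(levels(grades))  # default level order is alphabetical

# Re-create the factor with the levels listed in the intended order
grades_ordered <- factor(grades,
                         levels = c("AA", "A", "B", "C", "HR"),
                         ordered = TRUE)
print(levels(grades_ordered))

# With an ordered factor, comparisons follow the level order
print(grades_ordered < "B")
```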
## Term: 12
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0400 0.0929 0.1434 0.1501 0.2064 0.2669
## --------------------------------------------------------
## Term: 36
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.1274 0.1815 0.1935 0.2599 0.4975
## --------------------------------------------------------
## Term: 60
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0669 0.1490 0.1870 0.1930 0.2319 0.3304
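The per-term summaries above have the layout that base R's `by()` prints when `summary` is applied to each group. The exact call is not shown in the document, so this is a guess at its shape, run on a toy data frame (`toy` and its values are made up).

```r
# Toy data frame standing in for loanData (values are made up)
toy <- data.frame(
  Term = factor(c(12, 36, 36, 60, 60, 36)),
  BorrowerRate = c(0.09, 0.13, 0.26, 0.15, 0.23, 0.18)
)

# One summary() per level of Term, printed in the same layout as above
rate_by_term <- by(toy$BorrowerRate, toy$Term, summary)
print(rate_by_term)

# A single statistic per group is easier to work with via tapply()
median_by_term <- tapply(toy$BorrowerRate, toy$Term, median)
print(median_by_term)
```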
## Warning in RColorBrewer::brewer.pal(n, pal): n too large, allowed maximum for palette Set1 is 9
## Returning the palette you asked for with that many colors
## Warning: Removed 8554 rows containing missing values (geom_point).
## Warning: Removed 8554 rows containing missing values (stat_summary).
## Warning: Removed 591 rows containing missing values (geom_point).
## Warning: Removed 591 rows containing non-finite values (stat_boxplot).
## Warning: Removed 858 rows containing missing values (geom_point).
The average borrower rate decreased significantly from 2012 to 2014.
The lowest 10% of borrower rates have remained relatively steady.
The average loan amount dropped steeply in Q4 2008 and has been rising steadily since then.
Borrower rate varies significantly by CreditGrade.
Borrower rates vary quite a bit across states. The median borrower rate by state ranges from approximately 0.15 to 0.20.
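The statements above compare a per-group mean, a per-group 10% quantile, and a per-group median. The plotting code is not shown in the document, so the following is only a sketch of how such group statistics can be computed in base R with `tapply()`. The `toy` data frame and its values are made up, and the same pattern would apply with BorrowerState as the grouping variable.

```r
# Toy loans (hypothetical values) across two origination quarters
toy <- data.frame(
  LoanOriginationQuarter = c("Q1 2012", "Q1 2012", "Q1 2012",
                             "Q1 2014", "Q1 2014", "Q1 2014"),
  BorrowerRate = c(0.10, 0.24, 0.32, 0.09, 0.15, 0.21)
)

# Mean rate per quarter
mean_rate <- tapply(toy$BorrowerRate, toy$LoanOriginationQuarter, mean)

# 10% quantile per quarter, i.e. the "lowest 10%" of rates
low_rate <- tapply(toy$BorrowerRate, toy$LoanOriginationQuarter,
                   quantile, probs = 0.1)

# Median rate per quarter
median_rate <- tapply(toy$BorrowerRate, toy$LoanOriginationQuarter, median)

print(round(cbind(mean_rate, low_rate, median_rate), 3))
```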
Occupation has a strong influence on BorrowerRate. Occupations that typically require more education (e.g. engineer, computer programmer, judge) have borrower rates on the lower end of the spectrum, while occupations that typically require less (e.g. college freshman, nurse's aide, bus driver, laborer) have borrower rates on the higher end. The median borrower rate by occupation varies from approximately 0.125 to 0.225.

### Did you observe any interesting relationships between the other features (not the main feature(s) of interest)?
The strongest Relationship was between the Borrower Rate and Credit Grade, although
## Warning: Removed 569 rows containing missing values (stat_summary).
## facet_wrap(LoanOriginationQuarter)
## Warning: Removed 591 rows containing missing values (geom_point).
## Warning: Removed 315 rows containing missing values (geom_point).
## Warning: Removed 254 rows containing missing values (geom_point).
## Warning: Removed 22 rows containing missing values (geom_point).
## Warning: Removed 315 rows containing missing values (stat_summary).
## Warning: Removed 254 rows containing missing values (stat_summary).
## Warning: Removed 22 rows containing missing values (stat_summary).
##
## Pearson's product-moment correlation
##
## data: BorrowerRate and LoanOriginalAmount
## t = -26.632, df = 28971, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1658044 -0.1433251
## sample estimates:
## cor
## -0.1545848
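The test output above reports a weak negative correlation between BorrowerRate and LoanOriginalAmount. As a sanity check, the t statistic can be recomputed from the reported estimate and degrees of freedom using the standard relation t = r * sqrt(df) / sqrt(1 - r^2). The sketch below uses only the two numbers printed above.

```r
# Values copied from the cor.test() output above
r  <- -0.1545848
df <- 28971

# t statistic for a Pearson correlation
t_stat <- r * sqrt(df) / sqrt(1 - r^2)
print(round(t_stat, 3))

# Share of variance in one variable linearly explained by the other
r_squared <- r^2
print(round(r_squared, 3))
```

The very small p-value comes from the large sample size. The squared correlation is only about 0.02, so loan amount on its own explains little of the variation in borrower rate.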
## Warning: Removed 591 rows containing missing values (stat_summary).
## Warning: Removed 3 rows containing missing values (geom_point).
## Warning: Removed 17 rows containing missing values (geom_point).
## Warning: Removed 24 rows containing missing values (geom_point).
## Warning: Removed 41 rows containing missing values (geom_point).
## Warning: Removed 44 rows containing missing values (geom_point).
## Warning: Removed 20 rows containing missing values (geom_point).
## Warning: Removed 89 rows containing missing values (geom_point).
## Warning: Removed 101 rows containing missing values (geom_point).
## Warning: Removed 92 rows containing missing values (geom_point).
## Warning: Removed 286 rows containing missing values (geom_point).
## Warning: Removed 463 rows containing missing values (geom_point).
## Warning: Removed 67 rows containing missing values (geom_point).
## Warning: Removed 41 rows containing missing values (geom_point).
## Warning: Removed 157 rows containing missing values (geom_point).
## Warning: Removed 142 rows containing missing values (geom_point).
## Warning: Removed 194 rows containing missing values (geom_point).
## Warning: Removed 143 rows containing missing values (geom_point).
## Warning: Removed 181 rows containing missing values (geom_point).
## Warning: Removed 169 rows containing missing values (geom_point).
## Warning: Removed 276 rows containing missing values (geom_point).
## Warning: Removed 351 rows containing missing values (geom_point).
## Warning: Removed 508 rows containing missing values (geom_point).
## Warning: Removed 580 rows containing missing values (geom_point).
## Warning: Removed 571 rows containing missing values (geom_point).
## Warning: Removed 541 rows containing missing values (geom_point).
## Warning: Removed 328 rows containing missing values (geom_point).
## Warning: Removed 457 rows containing missing values (geom_point).
## Warning: Removed 440 rows containing missing values (geom_point).
## Warning: Removed 941 rows containing missing values (geom_point).
## Warning: Removed 905 rows containing missing values (geom_point).
## Warning: Removed 382 rows containing missing values (geom_point).
## Warning in `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels)
## else paste0(labels, : duplicated levels in factors are deprecated
## Warning: Removed 5 rows containing missing values (geom_point).
## Warning: Removed 25 rows containing missing values (geom_point).
## Warning: Removed 39 rows containing missing values (geom_point).
## Warning: Removed 97 rows containing missing values (geom_point).
## Warning: Removed 142 rows containing missing values (geom_point).
## Warning: Removed 180 rows containing missing values (geom_point).
## Warning: Removed 121 rows containing missing values (geom_point).
## Warning: Removed 146 rows containing missing values (geom_point).
## Warning: Removed 119 rows containing missing values (geom_point).
## Warning: Removed 309 rows containing missing values (geom_point).
## Warning: Removed 475 rows containing missing values (geom_point).
## Warning: Removed 69 rows containing missing values (geom_point).
## Warning: Removed 1 rows containing missing values (geom_point).
## Warning: Removed 43 rows containing missing values (geom_point).
## Warning: Removed 163 rows containing missing values (geom_point).
## Warning: Removed 149 rows containing missing values (geom_point).
## Warning: Removed 203 rows containing missing values (geom_point).
## Warning: Removed 145 rows containing missing values (geom_point).
## Warning: Removed 188 rows containing missing values (geom_point).
## Warning: Removed 177 rows containing missing values (geom_point).
## Warning: Removed 289 rows containing missing values (geom_point).
## Warning: Removed 372 rows containing missing values (geom_point).
## Warning: Removed 535 rows containing missing values (geom_point).
## Warning: Removed 615 rows containing missing values (geom_point).
## Warning: Removed 609 rows containing missing values (geom_point).
## Warning: Removed 581 rows containing missing values (geom_point).
## Warning: Removed 343 rows containing missing values (geom_point).
## Warning: Removed 485 rows containing missing values (geom_point).
## Warning: Removed 469 rows containing missing values (geom_point).
## Warning: Removed 945 rows containing missing values (geom_point).
## Warning: Removed 905 rows containing missing values (geom_point).
## Warning: Removed 415 rows containing missing values (geom_point).
## Warning in `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels)
## else paste0(labels, : duplicated levels in factors are deprecated
## Warning: Removed 5 rows containing missing values (geom_point).
## Warning: Removed 25 rows containing missing values (geom_point).
## Warning: Removed 44 rows containing missing values (geom_point).
## Warning: Removed 100 rows containing missing values (geom_point).
## Warning: Removed 144 rows containing missing values (geom_point).
## Warning: Removed 180 rows containing missing values (geom_point).
## Warning: Removed 123 rows containing missing values (geom_point).
## Warning: Removed 146 rows containing missing values (geom_point).
## Warning: Removed 120 rows containing missing values (geom_point).
## Warning: Removed 309 rows containing missing values (geom_point).
## Warning: Removed 475 rows containing missing values (geom_point).
## Warning: Removed 69 rows containing missing values (geom_point).
## Warning: Removed 1 rows containing missing values (geom_point).
## Warning: Removed 43 rows containing missing values (geom_point).
## Warning: Removed 163 rows containing missing values (geom_point).
## Warning: Removed 150 rows containing missing values (geom_point).
## Warning: Removed 203 rows containing missing values (geom_point).
## Warning: Removed 145 rows containing missing values (geom_point).
## Warning: Removed 188 rows containing missing values (geom_point).
## Warning: Removed 177 rows containing missing values (geom_point).
## Warning: Removed 293 rows containing missing values (geom_point).
## Warning: Removed 372 rows containing missing values (geom_point).
## Warning: Removed 535 rows containing missing values (geom_point).
## Warning: Removed 615 rows containing missing values (geom_point).
## Warning: Removed 609 rows containing missing values (geom_point).
## Warning: Removed 581 rows containing missing values (geom_point).
## Warning: Removed 344 rows containing missing values (geom_point).
## Warning: Removed 485 rows containing missing values (geom_point).
## Warning: Removed 469 rows containing missing values (geom_point).
## Warning: Removed 945 rows containing missing values (geom_point).
## Warning: Removed 905 rows containing missing values (geom_point).
## Warning: Removed 415 rows containing missing values (geom_point).
## Warning in `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels)
## else paste0(labels, : duplicated levels in factors are deprecated
## Warning: Removed 127 rows containing missing values (geom_point).
## Warning: Removed 423 rows containing missing values (geom_point).
## Warning: Removed 517 rows containing missing values (geom_point).
## Warning: Removed 578 rows containing missing values (geom_point).
## Warning: Removed 881 rows containing missing values (geom_point).
## Warning: Removed 898 rows containing missing values (geom_point).
## Warning: Removed 710 rows containing missing values (geom_point).
## Warning: Removed 754 rows containing missing values (geom_point).
## Warning: Removed 838 rows containing missing values (geom_point).
## Warning: Removed 1220 rows containing missing values (geom_point).
## Warning: Removed 1144 rows containing missing values (geom_point).
## Warning: Removed 181 rows containing missing values (geom_point).
## Warning: Removed 5 rows containing missing values (geom_point).
## Warning: Removed 193 rows containing missing values (geom_point).
## Warning: Removed 508 rows containing missing values (geom_point).
## Warning: Removed 501 rows containing missing values (geom_point).
## Warning: Removed 560 rows containing missing values (geom_point).
## Warning: Removed 516 rows containing missing values (geom_point).
## Warning: Removed 672 rows containing missing values (geom_point).
## Warning: Removed 771 rows containing missing values (geom_point).
## Warning: Removed 981 rows containing missing values (geom_point).
## Warning: Removed 1225 rows containing missing values (geom_point).
## Warning: Removed 1655 rows containing missing values (geom_point).
## Warning: Removed 1845 rows containing missing values (geom_point).
## Warning: Removed 2132 rows containing missing values (geom_point).
## Warning: Removed 2312 rows containing missing values (geom_point).
## Warning: Removed 1550 rows containing missing values (geom_point).
## Warning: Removed 2805 rows containing missing values (geom_point).
## Warning: Removed 3566 rows containing missing values (geom_point).
## Warning: Removed 5335 rows containing missing values (geom_point).
## Warning: Removed 4780 rows containing missing values (geom_point).
## Warning: Removed 1732 rows containing missing values (geom_point).
I discovered that the Prosper ratings
## Warning in `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels)
## else paste0(labels, : duplicated levels in factors are deprecated
The Prosper loan data is a rich dataset that reveals a great deal about how loans were distributed over time.